Received: by cheltenham.cs.arizona.edu; Mon, 23 May 1994 09:39:33 MST
Message-Id: <199405231424.AA17553@optima.cs.arizona.edu>
Date: Mon, 23 May 94 10:24:05 -0400
From: Mark Keil <keil@ch.hp.com>
To: icon-group@cs.arizona.edu
Subject: Japanese text processing & Japanese word counting?
Reply-To: keil@ch.hp.com
Status: RO
Errors-To: icon-group-errors@cs.arizona.edu
Folks:
Has anyone out there used Icon to process Japanese text?
I'm looking for pointers or code to handle JIS or EUC encoded
Japanese text with Icon.
This is an interesting problem, because regular ASCII can be
intermixed with two-byte encoded Japanese text. There are several
different ways to encode the Japanese, one of which (JIS) uses
shift-in/shift-out escape sequences to mark the transition. Japanese
text also doesn't have spaces to separate the words, making word
detection interesting.
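To make the byte-level half of the problem concrete, here is a minimal sketch in Python (not Icon) that walks EUC-encoded bytes and splits them into ASCII runs and Japanese runs. It assumes the input is EUC-JP; the function name split_euc_runs and the "ascii"/"jp" labels are made up for illustration, and it makes no attempt at word detection.

```python
def split_euc_runs(data: bytes) -> list[tuple[str, bytes]]:
    """Split EUC-JP bytes into ("ascii", run) and ("jp", run) pieces.

    Hypothetical helper: assumes EUC-JP input, not JIS or Shift JIS.
    """
    runs: list[tuple[str, bytes]] = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            kind, width = "ascii", 1   # plain 7-bit ASCII
        elif lead == 0x8E:
            kind, width = "jp", 2      # single shift 2: half-width katakana
        elif lead == 0x8F:
            kind, width = "jp", 3      # single shift 3: supplementary kanji
        elif 0xA1 <= lead <= 0xFE:
            kind, width = "jp", 2      # ordinary two-byte character
        else:
            raise ValueError(f"unexpected byte 0x{lead:02X} at offset {i}")
        if i + width > len(data):
            raise ValueError(f"truncated character at offset {i}")
        chunk = data[i:i + width]
        # Extend the previous run if it is of the same kind, else start a new one.
        if runs and runs[-1][0] == kind:
            runs[-1] = (kind, runs[-1][1] + chunk)
        else:
            runs.append((kind, chunk))
        i += width
    return runs


if __name__ == "__main__":
    sample = "Icon で日本語 text".encode("euc_jp")
    for kind, run in split_euc_runs(sample):
        print(kind, run.decode("euc_jp"))
```

The same walk should carry over to Icon string scanning by testing the ordinal value of each lead character and moving one, two, or three characters at a time. Splitting a Japanese run into words would still need a dictionary or a heuristic on top of this.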
Anybody have any ideas?
Thanks, Mark